An investigation of acoustic models for multilingual code-switching

Authors

  • Christopher M. White
  • Sanjeev Khudanpur
  • James K. Baker
Abstract

Multilingual speech processing continues to develop as speech technology spreads to heterogeneous clients and applications. We address a distinct problem of code-switching: the spontaneous but occasional use, within speech in one language (referred to as L1), of words, phrases, expressions or idioms from a second language (L2). We examine two alternatives for modeling the acoustics of such words: creation of L1 pronunciations for the out-of-language (OOL) words for use with L1 acoustic models, and retention of their L2 pronunciations for use with multilingual acoustic models. We test the hypothesis that the latter is a better acoustic model for OOL words. We develop a set of lexica in IPA form, a global phoneme inventory, and handle the problem of L2 word pronunciation by creating linguistically motivated pairwise mappings. We show that retention of L2 pronunciations with multilingual acoustic models better explains the observations when restricted to a forced alignment.
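
The first alternative in the abstract, creating an L1 pronunciation for an OOL word, can be illustrated with a minimal sketch. It assumes pronunciations are lists of IPA symbols and that a pairwise L2-to-L1 phoneme mapping is given as a dictionary. The mapping entries, the inventory, and the function name `map_to_l1` are hypothetical and chosen only for illustration; they are not the mappings developed in the paper.

```python
# Hypothetical pairwise mapping from L2 phonemes to their nearest L1 phonemes.
# These entries are illustrative assumptions, not the paper's actual mappings.
L2_TO_L1 = {
    "θ": "s",
    "ð": "z",
    "ɹ": "r",
    "æ": "a",
}

# Hypothetical L1 phoneme inventory (IPA symbols).
L1_INVENTORY = {"s", "z", "r", "a", "i", "k", "t", "n", "ŋ"}


def map_to_l1(l2_pron, l1_inventory, mapping):
    """Rewrite an L2 pronunciation so it uses only L1 phonemes.

    Phonemes already in the L1 inventory are kept; the rest are replaced
    through the pairwise mapping. A phoneme with no mapping raises KeyError.
    """
    l1_pron = []
    for phoneme in l2_pron:
        if phoneme in l1_inventory:
            l1_pron.append(phoneme)
        elif phoneme in mapping:
            l1_pron.append(mapping[phoneme])
        else:
            raise KeyError(f"no L1 mapping for phoneme {phoneme!r}")
    return l1_pron


# An L2 pronunciation containing one phoneme ("θ") that is absent from L1.
l2_pron = ["θ", "i", "ŋ", "k"]
print(map_to_l1(l2_pron, L1_INVENTORY, L2_TO_L1))
```

Under the second alternative, the L2 pronunciation would be kept unchanged and scored with multilingual acoustic models, so no such mapping step would be applied to the OOL word.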

Similar resources

Motivational Determinants of Code-Switching in Iranian EFL Classrooms

“Code-switching”, an important issue in both language classrooms and sociolinguistics, has been examined in studies of bilingual and multilingual societies. First proposed by Haugen (1956) and later developed by Grosjean (1982), the term code-switching refers to language alternation during communication. Although code-switching is unavoidable in bilingual and...

Speech Recognition on English-Mandarin Code-Switching Data using Factored Language Models - with Part-of-Speech Tags, Language ID and Code-Switch Point Probability as Factors

Code-switching (CS) is defined as “the alternate use of two or more languages in the same utterance or conversation” [1]. CS is a widespread phenomenon in multilingual communities, where multiple languages are used concurrently in a conversation. For automatic speech recognition (ASR), intra-sentential code-switching in particular poses an interesting challenge due to the multilingual context for la...

Language-dependent State Clustering for Multilingual Speech Recognition in Afrikaans, South African English, Xhosa and Zulu

The development of automatic speech recognition systems requires significant quantities of annotated acoustic data. In South Africa, the large number of spoken languages hampers such data collection efforts. Furthermore, code switching and mixing are commonplace since most citizens speak two or more languages fluently. As a result, a considerable degree of phonetic cross pollination between lang...

Dependency Parsing of Code-Switching Data with Cross-Lingual Feature Representations

This paper describes the test of a dependency parsing method which is based on bidirectional LSTM feature representations and multilingual word embedding, and evaluates the results on mono- and multilingual data. The results are similar in all cases, with slightly better results achieved using multilingual data. The languages under investigation are Komi-Zyrian and Russian. Examination of the r...

Implications of Sepedi/English code switching for ASR systems

Code switching (the process of switching from one language to another during a conversation) is a common phenomenon in multilingual environments. Where a minority and dominant language coincide, code switching from the minority language to the dominant language can become particularly frequent. We analyse one such scenario: Sepedi spoken in South Africa, where English is the dominant language; ...

Publication date: 2008